Neurological intervention transition model for dynamic prediction of good outcome in spontaneous subarachnoid haemorrhage

Deterioration of neurovascular conditions can be rapid in patients with spontaneous subarachnoid haemorrhage (SAH) and often leads to poor clinical outcomes. Therefore, it is crucial to promptly assess and continually track the progression of the disease. This study incorporated baseline clinical conditions, repeatedly measured neurological grades and haematological biomarkers for dynamic outcome prediction in patients with spontaneous SAH. Neurological intervention, mainly aneurysm clipping and endovascular embolisation, was also incorporated as an intermediate event in developing a neurological intervention transition (NIT) joint model. A retrospective cohort study was performed on 701 patients with spontaneous SAH from the MIMIC-IV dataset, with a study period of 14 days. A dynamic prognostic model predicting patient outcome was developed by combining a Cox model with piecewise linear mixed-effect models to incorporate different types of prognostic information. Clinical baseline covariates, including cerebral oedema, cerebral infarction, respiratory failure, hydrocephalus and vasospasm, as well as repeatedly measured Glasgow Coma Scale (GCS), glucose and white blood cell (WBC) levels, were the covariates contributing to the optimal model. Incorporating neurological intervention as an intermediate event increased prediction performance compared with the baseline joint modelling approach. The average AUC of the optimal model proposed in this study is 0.7783 across different starting points of prediction and prediction intervals. The proposed model can provide dynamic prognosis for spontaneous SAH patients and offers significant potential benefits in critical care management.


Data collection and visualisation
Patients recruited in this study were selected from MIMIC-IV, a publicly available database sourced from the electronic health record (EHR) of the Beth Israel Deaconess Medical Center between 2008 and 2019 [7]. A total of 766 patients with spontaneous SAH as the primary diagnosis were initially included in the study, identified by ICD-9-CM code 430 or ICD-10-CM code I60 and their specifiers. Of these, 65 patients were excluded because their hospital stay was too short (< 24 h) or because they had no identified records of vital signs and neurological assessments.
A final total of 701 patients, all aged 18 and over, were included in this study. Among them, 409 (58.35%) patients were female, and the median age was 59 (IQR: 50-70). Essential hypertension was the most common clinical condition (340, 48.50%), followed by hydrocephalus (221, 31.53%), respiratory failure (149, 21.56%), and cerebral oedema (131, 18.69%). Clinical outcomes were measured by discharge destinations at the end of the study period. Good outcome (343, 48.93%) was defined as returning home or to rehabilitation, while poor outcome (358, 51.07%) included mortality, long-term acute care, and hospice care. The clinical characteristics of the dataset are shown in Table 1. Demographics, e.g., age and sex, are recorded in patients' clinical information in the database, while underlying conditions and complications are identified with ICD-9-CM and ICD-10-CM codes. Six indices were included as dynamic covariates to be selected for prognostic modelling: one neurological grade, i.e., GCS score; three haematological biomarkers, i.e., creatinine, WBC and glucose; and two vital signs, i.e., systolic blood pressure (SBP) and oxygen saturation (SpO2), all of which had been included and shown to be valuable in SAH prognosis research [8][9][10][11][12][13]. Table 2 presents the average values and standard deviations of these dynamic covariates, calculated over all observations of patients in the good and poor outcome groups, to provide a general statistical overview of each dynamic covariate. As the frequency of each index measurement can vary depending on the severity of patients' clinical conditions, the phase of the disease and the judgement of healthcare providers, we incorporated daily average values of dynamic covariates into prognostic modelling to normalise the frequency of measurements across patients, thereby reducing potential biases due to variations in measurement frequency.
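The daily averaging step described above can be illustrated with a minimal sketch. The record layout (patient identifier, hours since admission, value), the function name and the example readings are assumptions made for illustration and are not taken from the study's code or data.

```python
from collections import defaultdict
from statistics import mean


def daily_averages(records):
    """Collapse irregular measurements to one mean value per patient per day.

    `records` is an iterable of (patient_id, hours_since_admission, value)
    tuples; this layout is an assumption for illustration only.
    """
    buckets = defaultdict(list)
    for patient_id, hours, value in records:
        day = int(hours // 24)  # day 0 covers the first 24 h after admission
        buckets[(patient_id, day)].append(value)
    return {key: mean(values) for key, values in sorted(buckets.items())}


# Hypothetical GCS readings: patient "A" is assessed more often than "B".
gcs = [
    ("A", 2.0, 13), ("A", 8.0, 14), ("A", 20.0, 15),
    ("A", 30.0, 15),
    ("B", 5.0, 9),
    ("B", 40.0, 10), ("B", 47.0, 12),
]
daily = daily_averages(gcs)
```

After aggregation each patient contributes exactly one value per day, so a frequently assessed patient no longer carries more weight in the longitudinal sub-model than a rarely assessed one.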
Figure 1 visualises the trajectories and corresponding 95% confidence intervals of these dynamic covariates during the study period, using the locally estimated scatter-plot smoothing (LOESS) method. The LOESS method was applied separately to each group, estimating a smoothed curve that best represents the trend of each dynamic covariate for each group.
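For readers unfamiliar with LOESS, the core idea is a weighted linear fit around each evaluation point. The sketch below is a simplified, dependency-free version using tricube weights; the function name, the span of 0.75 and the example values are assumptions, and it omits the confidence bands shown in Fig. 1. It is not the implementation used to produce the figure.

```python
def loess_point(xs, ys, x0, span=0.75):
    """Tricube-weighted local linear fit evaluated at x0."""
    k = max(2, round(span * len(xs)))               # neighbours in the window
    k = min(k, len(xs))
    # Bandwidth: distance to the k-th nearest point, nudged up so that the
    # k-th neighbour itself keeps a small positive weight.
    h = max(sorted(abs(x - x0) for x in xs)[k - 1], 1e-12) * 1.001
    weights = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y + slope * (x0 - mean_x)


# Hypothetical daily mean GCS values for one outcome group over 14 days.
days = list(range(14))
gcs_mean = [11.8, 12.1, 12.0, 12.6, 12.9, 13.1, 13.0, 13.4,
            13.6, 13.5, 13.9, 14.0, 14.2, 14.1]
smoothed = [loess_point(days, gcs_mean, d) for d in days]
```

Applying the same function separately to the good and poor outcome groups yields one smoothed curve per group, which mirrors the per-group smoothing described for Fig. 1.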
According to the trajectory visualisation in Fig. 1 and the statistics shown in Table 2, GCS score is a strongly discriminative dynamic covariate between the two groups, while the trajectories of WBC and glucose also exhibit good discrimination between good and poor outcome. The other three covariates, however, do not exhibit discriminative power. Visualising the trajectories of dynamic prognostic covariates allows their patterns and trends to be observed for each group and provides an initial insight into the dynamic nature of these covariates. The significance of the prognostic values of these dynamic covariates needs to be jointly analysed with baseline covariates and intermediate events, which will be presented later.

Prognostic modelling
The prognostic modelling process starts with examining the effects of baseline covariates on clinical outcomes, which can be modelled using a Cox proportional hazards model. The Cox model is the most widely used method in survival analysis and can be used to investigate the effects of baseline covariates on clinical outcomes, given by [14]:

$$h_i(t) = h_0(t)\exp\Big(\sum_j \gamma_j \omega_{ji}\Big) \tag{1}$$

where $h_i(t)$ denotes the instantaneous rate of experiencing a good clinical outcome for patient $i$ at time $t$, $h_0(t)$ is the baseline hazard, and $\omega_{ji}$ is the value of the $j$th baseline covariate for patient $i$ with corresponding coefficient $\gamma_j$. A positive coefficient indicates an increasing chance of a good outcome, whereas a negative coefficient implies a higher risk of poor outcomes. The effect of the $j$th baseline covariate on outcome is measured by the hazard ratio (HR), computed as $\exp(\gamma_j)$.
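The relationship between Cox coefficients and hazard ratios can be checked numerically. The following is a minimal sketch; the coefficient values are hypothetical and are not the estimates reported in Table 3.

```python
import math


def hazard_ratio(gamma):
    """Hazard ratio for a one-unit increase in a covariate: exp(gamma)."""
    return math.exp(gamma)


def relative_hazard(gammas, omegas):
    """exp(sum_j gamma_j * omega_j): the hazard relative to the baseline hazard."""
    return math.exp(sum(g * w for g, w in zip(gammas, omegas)))


# Hypothetical coefficients for two binary baseline conditions (not from Table 3).
gammas = [-0.5, -0.3]
patient = [1, 1]  # both conditions present

hrs = [hazard_ratio(g) for g in gammas]
rel = relative_hazard(gammas, patient)
```

Because the covariates enter through an exponential, the combined relative hazard of a patient with both conditions equals the product of the individual hazard ratios, and negative coefficients give hazard ratios below 1.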
Estimated HRs of all analysed baseline covariates are less than 1, indicating lower chances of good outcomes, as shown in Table 3. Univariate analysis revealed the effects of single covariates on clinical outcome, while multivariate analysis considered the simultaneous effects of multiple covariates, estimating the effect of each covariate while accounting for the others. Among these baseline covariates, cerebral oedema, cerebral infarction, respiratory failure, hydrocephalus and vasospasm are significant in univariate survival analysis and remain significant in multivariate survival analysis. The HRs of significant baseline covariates increase towards 1 in multivariate analysis compared with univariate analysis, indicating that their effects on clinical outcomes are less pronounced when other variables are accounted for. With multiple baseline covariates included, the prognostic model does not need to rely on a single baseline covariate whose effect may be overestimated. Instead, a more comprehensive understanding of the patient's condition emerges, supporting more accurate predictions of prognosis.
As dynamic prognostic covariates can be measured with errors and influenced by caregivers, e.g., human biases in neurological assessments, linear mixed-effect models (LMM) are adopted to model the longitudinal properties of these dynamic covariates and to handle unobserved heterogeneity. Moreover, since each patient has repeated measurements of dynamic covariates, the measurements within the same patient are likely to be correlated. An LMM can account for this within-subject correlation by including random effects. The fixed-effect and random-effect terms in an LMM for a dynamic covariate describe, respectively, its population-level mean trajectory and individual-specific deviations.
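The longitudinal pattern of the $j$th dynamic covariate for the $i$th patient at time $t$ is thus given by:

$$y_{ji}(t) = m_{ji}(t) + \varepsilon_{ji}(t), \qquad m_{ji}(t) = \beta_j x_{ji}^{T}(t) + b_{ji} z_{ji}^{T}(t) \tag{2}$$

where $y_{ji}(t)$ denotes the observed value of the $j$th dynamic covariate, measured with error $\varepsilon_{ji}(t)$, while $m_{ji}(t)$ is the corresponding unobserved true value, composed of the fixed-effect term $\beta_j x_{ji}^{T}(t)$, representing the overall trend for all patients, and the random-effect term $b_{ji} z_{ji}^{T}(t)$, explaining patient-specific deviations from the overall trend. These dynamic covariates are then modelled simultaneously with baseline covariates, using the joint modelling approach, to incorporate up-to-date dynamic prognostic information into prognostication [15]. A survival sub-model incorporating multiple dynamic and baseline covariates is given by:

$$h_i(t) = h_0(t)\exp\Big(\sum_j \gamma_j \omega_{ji} + \sum_j \alpha_j m_{ji}(t)\Big) \tag{3}$$

where $h_0(t)$ is the baseline hazard, the first sum runs over baseline covariates and the second over dynamic covariates, and $\alpha_j$ is the coefficient linking the true value of the $j$th dynamic covariate to the outcome. In comparison with Eq. (1), Eq. (3) simultaneously analyses the effects of baseline and dynamic covariates on clinical outcomes by incorporating the longitudinal sub-models for these dynamic covariates described in Eq. (2).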
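To make the link between the longitudinal and survival sub-models concrete, the sketch below evaluates a patient's true trajectory and the resulting relative hazard for a single dynamic covariate. It assumes a linear time trend with a random intercept and slope, i.e., design vectors of the form (1, t), which is an illustrative choice rather than the specification used in the study; all numerical values are hypothetical.

```python
import math


def true_value(t, beta, b):
    """m_i(t) with x(t) = z(t) = (1, t): population trend plus patient deviation."""
    intercept = beta[0] + b[0]
    slope = beta[1] + b[1]
    return intercept + slope * t


def relative_hazard(t, gammas, omegas, alpha, beta, b):
    """exp(sum_j gamma_j * omega_j + alpha * m_i(t)) for one dynamic covariate."""
    baseline_part = sum(g * w for g, w in zip(gammas, omegas))
    return math.exp(baseline_part + alpha * true_value(t, beta, b))


# All numbers below are hypothetical and chosen only for illustration.
beta = (10.0, 0.2)   # population-level intercept and daily slope
b = (1.5, 0.1)       # this patient's random intercept and slope
gammas, omegas = [-0.4], [1]
alpha = 0.1

m_day5 = true_value(5, beta, b)   # 11.5 + 0.3 * 5
ratio = (relative_hazard(6, gammas, omegas, alpha, beta, b)
         / relative_hazard(5, gammas, omegas, alpha, beta, b))
```

Under these assumptions the day-to-day hazard ratio for the patient is exp(alpha × slope), so a patient whose covariate improves faster than the population average sees a correspondingly faster rise in the instantaneous chance of a good outcome.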
Next, there is a further source of prognostic information that can be incorporated in prognostic modelling, namely intermediate events, specifically neurological interventions in this study. For the prognostication of spontaneous SAH, neurological interventions can have a direct impact on disease progression and clinical outcomes, and can be incorporated into the modelling framework for more comprehensive prognostication.
To account for the effect of neurological interventions on disease progression, the longitudinal sub-model of prognostic modelling, denoted by Eq. (2), can be reformulated as Eq. (4), where $t_i$ is the time of the neurological intervention of the $i$th patient, while $t^{+}$ denotes the time relative to the neurological intervention. For the $i$th patient, $t^{+} = \max(0, t - t_i)$. The effects of neurological interventions are modelled as additive terms on both the overall population-level trend and the individual-specific terms of dynamic covariates after the time point of a neurological intervention. Correspondingly, Eq. (3) can be extended as Eq. (5). Equation (5) describes the prognosis of spontaneous SAH as two states and is obtained by combining Eqs. (3) and (4). Patients who have not received neurological interventions are regarded as being in the first state, while patients already treated with neurological interventions are in the second state, and $t_i$ is the time point of the state transition. Thus, this model is termed the neurological intervention transition (NIT) joint model, and its performance as a prognostic model will be compared with baseline joint models that do not include neurological interventions in prognostic modelling.
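One way to write down a model matching this description is sketched below. This is an assumed form, not a transcription of the study's equations: the tilde-marked coefficients and design vectors for the post-intervention terms are notation introduced here for illustration, and the sketch further assumes that the post-intervention design vectors vanish at $t^{+} = 0$ (for example, when they are linear in $t^{+}$), so that the added terms are inactive before the intervention.

```latex
% Assumed form of the longitudinal sub-model with post-intervention terms
\begin{aligned}
y_{ji}(t) &= m_{ji}(t) + \varepsilon_{ji}(t),\\
m_{ji}(t) &= \beta_j x_{ji}^{T}(t) + \tilde{\beta}_j \tilde{x}_{ji}^{T}(t^{+})
           + b_{ji} z_{ji}^{T}(t) + \tilde{b}_{ji} \tilde{z}_{ji}^{T}(t^{+}),
\qquad t^{+} = \max(0,\, t - t_i).
\end{aligned}

% Assumed form of the extended survival sub-model, using m_{ji}(t) from above
h_i(t) = h_0(t)\exp\Big(\sum_j \gamma_j \omega_{ji} + \sum_j \alpha_j m_{ji}(t)\Big)
```

Under these assumptions, a patient with $t \le t_i$ has $t^{+} = 0$ and follows the pre-intervention trajectory (first state), while for $t > t_i$ the tilde terms change both the population-level trend and the patient-specific deviation (second state), which gives the piecewise behaviour of the NIT joint model.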
Parameters included in this model are estimated using the Bayesian approach, wherein inference relies on a joint posterior distribution formed as the product of the joint likelihood of the observed data and the prior distribution. In this study, prior beliefs on the parameters in the joint modelling framework are taken from the parameter values separately estimated in the survival and longitudinal sub-models. The Bayesian approach is implemented via Markov chain Monte Carlo (MCMC) methods, with Gibbs and Metropolis-Hastings algorithms for sampling from distributions.
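The sampling idea can be illustrated on a toy problem. The sketch below is a random-walk Metropolis-Hastings sampler for a single parameter with a normal prior and a normal likelihood; it is far simpler than the joint model, does not include a Gibbs step, and uses hypothetical data. The prior mean stands in for a value estimated separately beforehand, in the spirit of the prior specification described above.

```python
import math
import random


def log_posterior(theta, data, sigma, prior_mean, prior_sd):
    """Unnormalised log posterior: normal prior on theta times normal likelihood."""
    log_prior = -0.5 * ((theta - prior_mean) / prior_sd) ** 2
    log_lik = sum(-0.5 * ((y - theta) / sigma) ** 2 for y in data)
    return log_prior + log_lik


def metropolis_hastings(data, sigma, prior_mean, prior_sd,
                        n_iter=20000, burn_in=2000, step=0.5, seed=0):
    """Random-walk Metropolis-Hastings sampler for a single parameter."""
    rng = random.Random(seed)
    theta = prior_mean                      # start the chain at the prior mean
    current = log_posterior(theta, data, sigma, prior_mean, prior_sd)
    samples = []
    for i in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data, sigma, prior_mean, prior_sd)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        if i >= burn_in:
            samples.append(theta)
    return samples


# Hypothetical data; the prior mean stands in for a separately estimated value.
data = [1.2, 0.8, 1.1, 0.9, 1.0]
samples = metropolis_hastings(data, sigma=0.5, prior_mean=0.5, prior_sd=1.0)
posterior_mean = sum(samples) / len(samples)

# Closed-form posterior mean for this conjugate toy problem, used as a check.
precision = 1 / 1.0 ** 2 + len(data) / 0.5 ** 2
analytic_mean = (0.5 / 1.0 ** 2 + sum(data) / 0.5 ** 2) / precision
```

Because this toy posterior has a closed form, the sampled mean can be compared against the analytic value; in the joint model no such closed form is available, which is why MCMC is needed.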

Simultaneous analysis on longitudinal and survival data
The prognostic value of each dynamic covariate was measured by its HR, calculated as $\exp(\alpha_j)$, when jointly analysed and adjusted for the five significant baseline covariates in Table 3. Table 4 presents the HRs of the included dynamic covariates with both the baseline and the proposed NIT joint modelling approaches.
According to the results in Table 4, GCS score is a strong independent dynamic prognostic factor for clinical outcomes. A one-unit increase in GCS score is associated with a 173.79% higher chance of a good outcome when GCS is the only dynamic covariate jointly modelled with baseline covariates. The presence of WBC abnormality and hyperglycaemia is found to be negatively associated with good outcome at 14 days.
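The percentage quoted above can be related back to a hazard ratio and an association coefficient. The sketch below assumes the percentage is defined as (HR − 1) × 100, which is the usual convention but is not stated explicitly in the text; the function names are illustrative.

```python
import math


def percent_change(hr):
    """Percentage change in the hazard of a good outcome implied by a hazard ratio."""
    return (hr - 1.0) * 100.0


def hr_from_percent(pct):
    """Inverse of percent_change."""
    return 1.0 + pct / 100.0


hr_gcs = hr_from_percent(173.79)   # 2.7379, derived from the reported percentage
alpha_gcs = math.log(hr_gcs)       # implied association coefficient, about 1.007
hr_two_points = hr_gcs ** 2        # two-point GCS difference under log-linearity
```

Under this convention, the effect compounds multiplicatively: a two-point difference in GCS corresponds to the squared hazard ratio rather than twice the percentage.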
The associations are not significant when these two haematological biomarkers are modelled as the only dynamic covariates, but become significant when they are jointly analysed with GCS score. Moreover, when jointly modelled with haematological biomarkers, the prognostic power of GCS score decreases. This suggests that these two haematological biomarkers can act as confounding variables that affect the association between GCS score and outcomes. The inclusion of these two haematological biomarkers helps to control for this confounding effect and provides a more accurate estimate of the HR of GCS score.
Vital signs, however, were not found to be associated with clinical outcomes, with HR values around 1. This may be explained by their dynamic nature: vital signs can fluctuate and change rapidly in response to various factors.
Table 5 gives the prediction performance of different models, where $T_s$ denotes the starting point of prediction and $dt$ is the prediction interval. AUC represents the overall discriminative ability of the prognostic model in distinguishing patients with good and poor outcomes. The composition of patients varies with different $T_s$ and $dt$, which can lead to variations in performance across prediction intervals. Thus we use the mean AUC, calculated by averaging the predictive performance over all $T_s$ and $dt$, to measure the overall prediction performance. Variabilities across starting points of prediction and prediction intervals were also calculated for each model, where low variability indicates that the model is robust over time and can consistently provide reliable predictions of clinical outcomes.
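The evaluation summary described above can be sketched as follows. This uses the simple pairwise definition of AUC for a binary outcome, which is a simplification of the time-dependent AUC typically used with joint models, and the scores and AUC values are hypothetical rather than those in Table 5.

```python
from statistics import mean, pstdev


def auc(scores, labels):
    """Probability that a good-outcome patient is scored above a poor-outcome one.

    Ties count as half. `labels` uses 1 for good outcome and 0 for poor outcome.
    """
    good = [s for s, y in zip(scores, labels) if y == 1]
    poor = [s for s, y in zip(scores, labels) if y == 0]
    if not good or not poor:
        raise ValueError("AUC needs at least one patient in each outcome group")
    wins = sum(1.0 if g > p else 0.5 if g == p else 0.0 for g in good for p in poor)
    return wins / (len(good) * len(poor))


# Hypothetical predicted probabilities of a good outcome for four patients.
example = auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])   # 3 of 4 pairs ordered correctly

# Hypothetical AUCs keyed by (T_s, dt); these are not the values in Table 5.
auc_grid = {(1, 2): 0.76, (3, 2): 0.72, (5, 2): 0.80}
values = list(auc_grid.values())
overall = mean(values)          # mean AUC across prediction windows
variability = pstdev(values)    # spread across prediction windows
```

Reporting the spread alongside the mean makes it visible when a model's average performance hides a weak prediction window.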
In both baseline and NIT joint models, jointly analysing these three dynamic covariates increases the overall prediction performance compared with relying solely on the GCS score, since WBC and glucose levels can provide prognostic information on inflammatory or infectious processes and cardiovascular incidents in addition to the GCS score, which mainly represents a patient's level of consciousness and neurological function. Incorporation of multiple dynamic covariates can thus provide a multifaceted view of the evolution of a patient's clinical status.
Comparing the predictive performance across different prediction intervals, the worst predictive performance for all model settings occurs when predicting from day 3, especially when predicting the outcome in the next two days. This can be explained by both the nature of spontaneous SAH and the dataset. Firstly, regarding the nature of spontaneous SAH, complications, e.g., vasospasm, are highly probable within this time period [16]. The lack of prognostic information about the time to complications during this critical period therefore hinders the model's ability to capture essential prognostic information, leading to reduced predictive performance. Secondly, calculations of AUCs are based on the comparison between the model's predictions and actual outcomes within the prediction interval and can be affected by a few outliers or atypical cases. Therefore, we measure and compare the overall performance across different model settings by averaging AUCs across multiple prediction periods. Figure 2 compares the prediction accuracy over time between the two modelling approaches and combinations of dynamic covariates, measured by the marginal AUC at each time of prediction. With all combinations of dynamic covariates, the prediction performance of the proposed NIT joint model is better than that of the corresponding model developed by baseline joint modelling methods. This figure also shows that the prediction performance of the proposed NIT joint model with GCS score, glucose and WBC as dynamic covariates is good and consistent across prediction times, with an AUC of at least around 0.75 for all starting points of prediction. Moreover, the prognostic NIT joint model can provide good prediction performance in both the acute phase ($T_s + dt \le 3$, mean AUC: 0.7683) and the sub-acute phase ($3 < T_s + dt \le 14$, mean AUC: 0.7797) of spontaneous SAH.
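The split into acute and sub-acute phases depends only on the day on which a prediction window ends. A minimal sketch of the phase-wise averaging is given below; the function names and the AUC values are hypothetical, while the thresholds of 3 and 14 days follow the definitions given above.

```python
from statistics import mean


def phase(t_start, dt, acute_end=3, study_end=14):
    """Classify a prediction window by the day on which it ends (T_s + dt)."""
    end = t_start + dt
    if end <= acute_end:
        return "acute"
    if end <= study_end:
        return "sub-acute"
    raise ValueError("prediction window extends beyond the study period")


def mean_auc_by_phase(auc_grid):
    """Average AUC values separately for acute and sub-acute prediction windows."""
    groups = {}
    for (t_start, dt), value in auc_grid.items():
        groups.setdefault(phase(t_start, dt), []).append(value)
    return {name: mean(values) for name, values in groups.items()}


# Hypothetical AUCs keyed by (T_s, dt); not the values reported in the study.
auc_grid = {(1, 1): 0.77, (1, 2): 0.75, (3, 2): 0.72, (5, 4): 0.80, (7, 7): 0.79}
by_phase = mean_auc_by_phase(auc_grid)
```

Note that a window starting on day 3 always falls in the sub-acute phase under this definition, since any positive prediction interval pushes its end beyond day 3.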
According to the predictive performance of different dynamic prognostic models, there are three main findings of this study. Firstly, repeatedly measured neurological status, measured by GCS score, is a strong dynamic predictor of clinical outcomes after adjusting for baseline clinical conditions. Incorporation of haematological biomarkers, i.e., WBC and glucose, can provide additive prognostic information and improve the prediction accuracy of prognostic models.
Secondly, intermediate events, i.e., neurological interventions in this study, provide prognostic information on disease progression. Incorporating intermediate events as transitions between states of prognostication can add granularity to a prognostic model by dividing the overall prognosis process into multiple prognostic states, which allows for a more detailed analysis of disease progression. Compared with baseline joint models, NIT joint models have better predictive accuracy when including the same baseline and dynamic covariates.
Moreover, NIT joint models contribute to personalised prognosis. Modelling the time point of neurological intervention as an individualised transition point between prognostic states adds individual-specific characteristics to the prognostic model, so that it can take individualised milestones in disease progression into account for personalised outcome predictions.
Finally, compared with baseline joint models, NIT joint models perform better in relatively long-term outcome prediction. The average improvement in prediction accuracy from the baseline to the NIT joint model across four covariate combinations is 0.0212 for prediction in the sub-acute phase, while the overall improvement is 0.0074. This merit of the NIT joint model may arise because it accounts for the change in disease progression, which has a greater impact on the values of dynamic covariates in the sub-acute phase of prognostication.
The multivariate NIT joint model incorporating GCS score, WBC and glucose as dynamic prognostic covariates is the optimal model in this study, increasing the predictive accuracy of outcome from 0.7422, in the baseline joint model considering only the neurological grades of patients, to 0.7783. It can be used as an accurate and comprehensive clinical tool for dynamic personalised prognosis in patients suffering from spontaneous SAH, potentially benefiting disease progression monitoring and optimising treatment plans for better clinical outcomes.

Discussion
This study has proposed a multivariate NIT joint model for the prognosis of spontaneous SAH. Compared with widely used clinical tools such as the Hunt and Hess grade, which are highly dependent on the patient's neurological status, the prognostic model explores the prognostic values of medical conditions, haematological biomarkers, and events of neurological intervention. It is thus suitable for modelling the multi-factorial mechanism of SAH prognosis. Moreover, prognostic information from medical conditions and haematological biomarkers is individual-specific and, together with individualised disease progression modelled by individualised prognostic state transitions, makes the NIT models proposed in this study a good clinical tool for personalised prognosis.
The optimal NIT joint model is composed of five baseline covariates, i.e., cerebral oedema, cerebral infarction, respiratory failure, hydrocephalus and vasospasm; three dynamic covariates, i.e., GCS score, WBC and glucose; and a state transition indicator, i.e., the intermediate event of neurological intervention. This model can provide accurate outcome predictions across all starting points of prediction and prediction intervals, with an overall AUC of 0.7783, and AUCs of 0.7683 and 0.7797 in the acute and sub-acute phases of spontaneous SAH, respectively.
In the analysis of the predictive power of neurological status, haematological biomarkers, and vital signs, GCS score was found to be the most valuable covariate in prognostication. This finding is consistent with the fact that neurological scales, which estimate patients' clinical outcomes based on the results of neurological assessments, are widely used clinical tools for the prognosis of SAH. WBC and glucose are not independent prognostic factors but can provide additive prognostic information to neurological status and improve the model's prediction accuracy.
Although the prognostic values of vital signs were found to be limited in this study, their variability, e.g., systolic blood pressure variability, which may indicate impaired blood pressure regulation and cardiovascular instability, is drawing research attention in prognosis studies [17]. With more frequent measurements of vital signs, their variability and the changes in variability during disease progression can be obtained and potentially included as influential dynamic covariates for the prognostication of SAH.
This study provides a novel approach to incorporating different types of covariates and events into the prognostication of spontaneous SAH, on the basis of the joint modelling framework. Because of its ability to incorporate different types of prognostic information, this approach can also be extended to model the prognosis of other cerebrovascular diseases. The prognostic information that can be incorporated comes from demographics, clinical conditions, neurological status, haematological biomarkers, vital signs, radiological findings, and intermediate events that can have a direct impact on disease progression, such as complications and neurological surgeries. Moreover, the Bayesian approach adopted for parameter estimation allows expert knowledge to be incorporated into prior distributions, which facilitates clinical use and improves the model's interpretability for clinicians.
The findings of this study also contribute to the development of personalised prognosis, which is a growing trend in critical care management. Personalised prognosis considers individual characteristics and can guide tailored treatment decisions, individualised follow-up and surveillance strategies, and resource allocation. Implementing personalised prognosis in clinical practice through integration with EHR systems can help improve patient outcomes, enhance the collaboration between patients and caregivers, and support more patient-centred healthcare. This study is the first to take all three types of individual-specific covariates and events, i.e., baseline clinical conditions, dynamic prognostic factors and clinical intermediate events, into prognostic modelling in studies of neurovascular diseases. The findings and approaches proposed in this study can thus potentially contribute to the development of personalised prognosis for neurovascular diseases (supplementary information file 1).
There are three main future directions for this study. Firstly, neurological intervention is not the only intermediate event that can provide additive prognostic information to baseline and dynamic covariates. Occurrences of major complications during critical care, e.g., re-bleeding and delayed cerebral ischaemia (DCI), are also informative indicators of changes in disease progression and often indicate worsening neurological conditions. Incorporating multiple intermediate events and developing a multi-state prognostic model can potentially provide more accurate and comprehensive prognostication. Secondly, we incorporated daily average values of dynamic covariates into prognostic modelling. Thus, our prognostic model may not fully capture within-day changes in patients' conditions, and the precision of predictions may be affected; this can be improved by measuring dynamic covariates more frequently. Finally, applying deep learning strategies to the prognosis of spontaneous SAH can potentially help capture informative features for prognosis research, improving the model's flexibility and adaptability, and such strategies can potentially be integrated with our proposed modelling framework for higher predictive performance [18].

Figure 2. Comparisons in prediction accuracy between baseline and NIT joint models.

Table 1. Clinical characteristics of included dataset.

Table 2. Statistics of all observations of six dynamic covariates.

Table 3. Results of univariate and multivariate survival analysis of baseline covariates. Values presented in bold indicate statistical significance.

Table 5. Prediction performance of different joint model settings.